Credibility Improves Topical Blog Post Retrieval

Authors

  • Wouter Weerkamp
  • Maarten de Rijke
Abstract

Topical blog post retrieval is the task of ranking blog posts with respect to their relevance for a given topic. To improve topical blog post retrieval we incorporate textual credibility indicators in the retrieval process. We consider two groups of indicators: post level (determined using information about individual blog posts only) and blog level (determined using information from the underlying blogs). We describe how to estimate these indicators and how to integrate them into a retrieval approach based on language models. Experiments on the TREC Blog track test set show that both groups of credibility indicators significantly improve retrieval effectiveness; the best performance is achieved when combining them.
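The abstract does not spell out how the indicators enter the language model, so the following is only a minimal sketch under stated assumptions: a Dirichlet-smoothed query-likelihood score is combined with a document prior derived from credibility scores. The function names, the indicator dictionaries, the averaging scheme, and the `weight` and `mu` parameters are all hypothetical illustrations, not the authors' actual estimators.

```python
import math
from collections import Counter


def query_log_likelihood(query_terms, doc_terms, coll_tf, coll_len, mu=1000.0):
    """Dirichlet-smoothed log P(q | d); an assumed, standard smoothing choice."""
    tf = Counter(doc_terms)
    doc_len = len(doc_terms)
    score = 0.0
    for term in query_terms:
        p_coll = coll_tf.get(term, 0) / coll_len
        p = (tf[term] + mu * p_coll) / (doc_len + mu)
        if p == 0.0:
            # Term unseen in the whole collection: the post cannot match.
            return float("-inf")
        score += math.log(p)
    return score


def credibility_prior(post_indicators, blog_indicators, weight=0.5, floor=1e-6):
    """Hypothetical prior: weighted mean of post-level and blog-level scores.

    Each indicator is assumed to be already normalised to [0, 1].
    """
    post = sum(post_indicators.values()) / len(post_indicators)
    blog = sum(blog_indicators.values()) / len(blog_indicators)
    return max(weight * post + (1.0 - weight) * blog, floor)


def rank(query_terms, posts, mu=1000.0, weight=0.5):
    """Rank posts by log P(q | d) + log P(d), with P(d) the credibility prior."""
    coll_tf = Counter()
    for post in posts:
        coll_tf.update(post["terms"])
    coll_len = sum(coll_tf.values())

    scored = []
    for post in posts:
        ll = query_log_likelihood(query_terms, post["terms"], coll_tf, coll_len, mu)
        prior = credibility_prior(
            post["post_indicators"], post["blog_indicators"], weight
        )
        scored.append((ll + math.log(prior), post["id"]))
    return [post_id for _, post_id in sorted(scored, reverse=True)]


# Two posts with identical text but different (made-up) credibility scores.
example_posts = [
    {"id": "a", "terms": ["blog", "retrieval", "test"],
     "post_indicators": {"indicator_1": 0.2}, "blog_indicators": {"indicator_2": 0.2}},
    {"id": "b", "terms": ["blog", "retrieval", "test"],
     "post_indicators": {"indicator_1": 0.9}, "blog_indicators": {"indicator_2": 0.9}},
]
ranking = rank(["retrieval"], example_posts)
```

In this sketch the two example posts have the same query likelihood, so the ordering is decided entirely by the prior. Setting `weight` to 1.0 or 0.0 would use only post-level or only blog-level indicators, which loosely mirrors the two groups the abstract distinguishes.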

Similar resources

New Metrics for Newsblog Credibility

The blogosphere is an invaluable source of insight into attitudes towards significant world and local events. Traditional measures of topical relevance, timeliness, specificity and credibility are inadequate when it comes to blogs, however, due to their short length, high degree of quotation, exophoricity, and the short life cycle of blog postings. In this paper, we motivate a novel metric for ...

Diversity-based Blog Feed Retrieval

Blog distillation (blog feed retrieval) is a task in blog retrieval where the goal is to rank blogs according to their recurrent relevance to a query topic. One of the main properties of blog feed retrieval is that the unit of retrieval is a collection of documents as opposed to a single document as in other IR tasks. This collection retrieval nature of blog distillation introduces new challeng...

The University of Amsterdam at TREC 2008: Blog, Enterprise, and Relevance Feedback

We describe the participation of the University of Amsterdam’s ILPS group in the blog, enterprise and relevance feedback track at TREC 2008. Our main preliminary conclusions are that estimating mixture weights for external expansion in blog post retrieval is non-trivial and we need more analysis to find out why it works better for blog distillation than for blog post retrieval. For the relevanc...

New metrics for blog mining

Blogs represent an important new arena for knowledge discovery in open source intelligence gathering. Bloggers are a vast network of human (and sometimes non-human) information sources monitoring important local and global events, and other blogs, for items of interest upon which they comment. Increasingly, issues erupt from the blog world and into the real world. In order to monitor blogging a...

The University of Amsterdam at TREC 2008

We describe the participation of the University of Amsterdam’s ILPS group in the blog, enterprise and relevance feedback track at TREC 2008. Our main preliminary conclusions are that estimating mixture weights for external expansion in blog post retrieval is non-trivial and we need more analysis to find out why it works better for blog distillation than for blog post retrieval. For the relevanc...

Publication date: 2008